Abstract

Consider a joint quantum state of a system and its environment. A measurement
on the environment induces a decomposition of the system state. Using
algorithmic information theory, we define the preparation information of a pure or
mixed state in a given decomposition. We then define an optimal decomposition
as a decomposition for which the average preparation information is minimal. The
average preparation information for an optimal decomposition characterizes the
system-environment correlations. We discuss properties and applications of the
concepts introduced above and give several examples.

1 Introduction

It is a distinctive feature of quantum mechanics that more information is required to
prepare an ensemble of nonorthogonal quantum states than can be recovered from the
ensemble by measurements. Whereas the von Neumann entropy of the density operator
of the ensemble is bounded above by the logarithm of the dimension of Hilbert space,
$\log D$, the preparation information for a uniform ensemble of pure states is of the same
order as $D$ [?].

An ensemble of quantum states is defined by a list of states together with their
probabilities, $\{p_r, \rho_r\}$. An ensemble can also be regarded as a decomposition of the
average density operator, $\rho = \sum_r p_r \rho_r$. Ensembles of quantum states of a system $\mathcal{S}$ arise
in a natural way from the correlations of $\mathcal{S}$ with an environment $\mathcal{E}$. Given the total
state $\rho_{\rm total}$ of the joint system $\mathcal{S} \otimes \mathcal{E}$, any generalized measurement or POVM [?] on $\mathcal{E}$
induces an ensemble on $\mathcal{S}$. In this paper, we give a precise definition of the preparation
information of a state in the ensemble induced by $\rho_{\rm total}$ and an environment POVM.

*E-mail: a.soklakov@rhbnc.ac.uk 
†E-mail: r.schack@rhbnc.ac.uk

The concept of preparation information leads naturally to a definition of optimal
ensembles or, equivalently, optimal density-operator decompositions. The average
preparation information of a state in an optimal ensemble is then a property of $\rho_{\rm total}$ alone
(given the split of the total Hilbert space into $\mathcal{S}$ and $\mathcal{E}$). The average preparation
information characterizes the system-environment correlations by the information about the
environment needed to obtain a given amount of information about the system.

Optimal decompositions in the sense defined here have been used for the investigation
of optimal quantum trajectories in quantum optics [?]. Quantum trajectories are defined
as follows. In a typical quantum-optical experiment, the system consists of selected
atoms and field modes inside an optical cavity, whereas the environment consists of the
continuum of modes outside the cavity. The time evolution of the cavity state conditional
on the results of, e.g., homodyne measurements outside the cavity defines a quantum
trajectory [?]. For an alternative concept of optimality see Ref. [?].

The average preparation information for an optimal ensemble has been proposed as a
measure of quantum chaos [?]. When a chaotic system interacts with its environment,
one loses the ability to predict its time evolution. The preparation information quantifies
the amount of information needed about the environment to keep the ability to predict
the system state to a given accuracy. In conjunction with Landauer's principle [10], this
places a fundamental lower limit on the free-energy cost of predicting the time evolution
of a dynamical system [11].

The paper is organized as follows. Section 2 defines the concepts of preparation
information and optimal ensembles, and derives some basic properties. In Sec. 3, we
illustrate the theory through several examples. Some mathematical details are deferred
to Sec. 4.



2 Preparation information and optimal ensembles 

Let $D$ and $D_{\mathcal{E}}$ denote the Hilbert-space dimensions of the system $\mathcal{S}$ and the environment
$\mathcal{E}$, respectively. We will normally assume that $D_{\mathcal{E}} \gg D$. Now consider a joint state
$\rho_{\rm total}$ on $\mathcal{S} \otimes \mathcal{E}$. The state of the system alone, $\rho$, is then obtained by tracing out the
environment,

$$\rho = {\rm tr}_{\mathcal{E}}(\rho_{\rm total}) . \tag{1}$$

The von Neumann entropy of the system is

$$H = -{\rm tr}(\rho \log \rho) , \tag{2}$$

where here and throughout this paper, log denotes the base-2 logarithm. We now perform
an arbitrary measurement on the environment [?], described by a POVM, $\{E_r\}$, where
the $E_r$ are positive environment operators such that

$$\sum_r E_r = \mathbb{1}^{\mathcal{E}} = \text{(environment unit operator)} . \tag{3}$$

The probability of obtaining result $r$ is given by

$$p_r = {\rm tr}(\rho_{\rm total} E_r) , \tag{4}$$

and the system state after a measurement that yields result $r$ is

$$\rho_r = \frac{{\rm tr}_{\mathcal{E}}(\rho_{\rm total} E_r)}{p_r} . \tag{5}$$

By summing over $r$ and using the completeness of the POVM, we obtain

$$\sum_r p_r \rho_r = {\rm tr}_{\mathcal{E}}(\rho_{\rm total}) = \rho . \tag{6}$$

The ensemble $\{p_r, \rho_r\}$ forms a decomposition of $\rho$. To characterize the ensemble, we
define the system entropy conditional on measurement outcome $r$,

$$H_r = -{\rm tr}(\rho_r \log \rho_r) , \tag{7}$$

the average conditional entropy,

$$\bar H = \sum_r p_r H_r , \tag{8}$$

and the average entropy decrease due to the measurement, $\Delta H = H - \bar H$. These
quantities obey the double inequality

$$0 \le \Delta H \le H , \tag{9}$$

which follows from the concavity of the von Neumann entropy [12, 13]. The content of
the first of these inequalities is that a measurement on the environment will not, on the
average, increase the system entropy.
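As a rough numerical illustration of Eqs. (4) to (9), the following sketch computes the ensemble induced by an environment POVM and checks the double inequality. The toy state (the system states |0> and |+> correlated with two orthogonal environment projectors) and the function names are assumptions chosen for illustration; they do not appear in the text above. NumPy is assumed to be available.

```python
import numpy as np

def entropy_bits(rho):
    """Von Neumann entropy -tr(rho log2 rho), in bits."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]                 # convention: 0 log 0 = 0
    return float(-np.sum(evals * np.log2(evals)))

def induced_ensemble(rho_total, povm, d_sys, d_env):
    """Ensemble {p_r, rho_r} of Eqs. (4)-(5); the system index comes first."""
    ensemble = []
    for E_r in povm:
        op = rho_total @ np.kron(np.eye(d_sys), E_r)
        op = op.reshape(d_sys, d_env, d_sys, d_env)
        unnormalised = np.einsum("iaja->ij", op)  # partial trace over E
        p_r = float(np.trace(unnormalised).real)
        if p_r > 1e-12:
            ensemble.append((p_r, unnormalised / p_r))
    return ensemble

# Hypothetical toy state, not taken from the paper.
ket0 = np.array([1.0, 0.0])
ketp = np.array([1.0, 1.0]) / np.sqrt(2)
P = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
rho_total = (0.5 * np.kron(np.outer(ket0, ket0), P[0])
             + 0.5 * np.kron(np.outer(ketp, ketp), P[1]))

ensemble = induced_ensemble(rho_total, P, d_sys=2, d_env=2)
rho = sum(p * r for p, r in ensemble)                    # Eq. (6)
H = entropy_bits(rho)                                    # Eq. (2)
H_bar = sum(p * entropy_bits(r) for p, r in ensemble)    # Eq. (8)
delta_H = H - H_bar
print(H, H_bar, delta_H)
```

In this toy case the projective measurement leaves the system in a pure state for each outcome, so the whole system entropy is removed and the upper bound in Eq. (9) is saturated.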

Now let $\{p_r, \rho_r\}$ be the ensemble induced by the POVM $\{E_r\}$. We denote by
$I(\rho_k|\rho_{\rm total}, \{E_r\})$ the conditional algorithmic information to specify $\rho_k$, given the
ensemble (see [14] and references in [?]). The quantity $I(\rho_k|\rho_{\rm total}, \{E_r\})$ defines the preparation
information of the state $\rho_k$, given the total state $\rho_{\rm total}$ and the POVM. We also define
the average preparation information

$$\bar I(\rho_{\rm total}, \{E_r\}) = -\sum_r p_r \log p_r . \tag{10}$$

This definition is justified, because the average algorithmic information can be bounded
above and below as follows [?]:

$$-\sum_r p_r \log p_r \le \sum_k p_k\, I(\rho_k|\rho_{\rm total}, \{E_r\}) \le -\sum_r p_r \log p_r + 1 . \tag{11}$$

The average preparation information is never smaller than the average system entropy
decrease $\Delta H$,

$$\bar I(\rho_{\rm total}, \{E_r\}) \ge \Delta H . \tag{12}$$

This inequality is a consequence of a general theorem about average density operators.

The next step is to define a $\Delta H$-decomposition of $\rho$ as a decomposition whose average
entropy decrease $H - \bar H$ is at least $\Delta H$, and an optimal $\Delta H$-decomposition of $\rho$ as a
$\Delta H$-decomposition with minimal average preparation information $\bar I$. The average
preparation information for an optimal $\Delta H$-decomposition,

$$\bar I_{\min} = \inf_{\Delta H\text{-decompositions}} \bar I , \tag{13}$$

is then a property of $\rho_{\rm total}$, and characterizes the system-environment correlations. (If
there is no $\Delta H$-decomposition for which $\bar I$ is minimal, we will call any decomposition
optimal for which $\bar I < \bar I_{\min} + \epsilon$ for some given small constant $\epsilon$.) The quantity $\bar I_{\min}$ is
the information about the environment needed to reduce the system entropy by $\Delta H$. A
useful generalization results from taking the infimum in Eq. (13) over a restricted class
of POVMs, as in the quantum-optical example of Ref. [?]. This defines ensembles that
are optimal with respect to a given class of environment measurements.



3 Examples 

In this section, $\{P_k^{\mathcal{E}},\ k = 1, \ldots, D_{\mathcal{E}}\}$ denotes a complete set of orthogonal environment
projectors. In the three examples discussed below, we will restrict the class of
environment measurements to orthogonal projections of the form

$$E_r = \sum_{k \in K_r} P_k^{\mathcal{E}} , \tag{14}$$

where $K_r \subset \{1, \ldots, D_{\mathcal{E}}\}$. In all three examples, it seems intuitively clear that ensembles
which are optimal with respect to this class of measurements are also, to a good
approximation, optimal with respect to the class of all possible environment measurements. We
have not, however, been able to prove this statement rigorously.



3.1 A trivial example 

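Here, the system is a qubit, for which the dimension of Hilbert space is $D = 2$. Let
$|0\rangle$ and $|1\rangle$ be orthogonal basis states for the qubit, define $|\psi_1\rangle = |0\rangle$, $|\psi_2\rangle = |1\rangle$,
$|\psi_3\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ and $|\psi_4\rangle = \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$, and let

$$\rho_{\rm total} = \frac{1}{4} \sum_{k=1}^{4} |\psi_k\rangle\langle\psi_k| \otimes P_k^{\mathcal{E}} \tag{15}$$

be the joint density operator of system and environment. The state of the system alone
is then given by

$$\rho = {\rm tr}_{\mathcal{E}}(\rho_{\rm total}) = \frac{1}{2}\bigl(|0\rangle\langle 0| + |1\rangle\langle 1|\bigr) , \tag{16}$$

for which the system entropy in the absence of measurements is given by $H = 1$.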

Suppose we want to reduce the system entropy by $\Delta H = 1$ bit, i.e., we want the
conditional system state to be pure. The only environment measurement achieving this
is given by $E_r = P_r^{\mathcal{E}}$, which results in the unique and therefore optimal $\Delta H = 1$ ensemble
given by $\rho_r = |\psi_r\rangle\langle\psi_r|$, $p_r = 1/4$. The average preparation information for this ensemble
is $\bar I = 2$ bits, and hence $\bar I_{\min} = 2$ bits.

For a different value of $\Delta H$, consider the ensemble defined by
$\rho_1 = \frac{1}{2}(|\psi_1\rangle\langle\psi_1| + |\psi_3\rangle\langle\psi_3|)$, $\rho_2 = \frac{1}{2}(|\psi_2\rangle\langle\psi_2| + |\psi_4\rangle\langle\psi_4|)$, $p_1 = p_2 = 1/2$, which is induced by the POVM
$E_1 = P_1^{\mathcal{E}} + P_3^{\mathcal{E}}$ and $E_2 = P_2^{\mathcal{E}} + P_4^{\mathcal{E}}$. Each $\rho_r$ is an equal mixture of two pure states with
overlap $1/\sqrt{2}$ and therefore has eigenvalues $(1 \pm 1/\sqrt{2})/2$. For this ensemble,
$H_1 = H_2 = -{\rm tr}(\rho_1 \log \rho_1) \simeq 0.60$, and hence $\Delta H = H - H_1 \simeq 0.40$. The average preparation information is $\bar I = 1$.
It is easy to see that, with respect to our restricted class of measurements, this is an
optimal $\Delta H$-decomposition, and hence $\bar I_{\min} = 1$. In this example, to obtain 0.40 bits of
information about the system, 1 bit of information about the environment is needed.
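The entropy of the two-state mixtures in this example can be checked with a few lines of code. The sketch below assumes only the standard fact that an equal mixture of two pure states with overlap c has eigenvalues (1 ± |c|)/2; the variable names are illustrative.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy of a spectrum, in bits (0 log 0 = 0)."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# rho_1 is an equal mixture of |0> and (|0> + |1>)/sqrt(2), whose overlap is 1/sqrt(2).
overlap = 1 / math.sqrt(2)
spectrum = [(1 + overlap) / 2, (1 - overlap) / 2]

H = 1.0                              # entropy of the maximally mixed qubit, Eq. (16)
H_1 = entropy_bits(spectrum)         # conditional entropy of either outcome
delta_H = H - H_1                    # average entropy decrease
I_bar = entropy_bits([0.5, 0.5])     # Eq. (10) for p_1 = p_2 = 1/2

print(H_1, delta_H, I_bar)
```

The same two lines defining `spectrum` can be reused for any pairing of nonorthogonal states in the example, since all such pairs have the same overlap.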



3.2 Random vectors in Hilbert space 

In the trivial example considered above, the average preparation information $\bar I_{\min}$ for an
optimal ensemble is significantly larger than the corresponding entropy reduction $\Delta H$.
In the present subsection, we show that $\bar I_{\min}$ can vastly exceed $\Delta H$.

Assume that $\log D_{\mathcal{E}} \gg \log D$ and consider

$$\rho_{\rm total} = \frac{1}{D_{\mathcal{E}}} \sum_{k=1}^{D_{\mathcal{E}}} |\psi_k\rangle\langle\psi_k| \otimes P_k^{\mathcal{E}} , \tag{17}$$

where the $|\psi_k\rangle$ are distributed randomly in $D$-dimensional (projective) Hilbert space [?].
Here, the system entropy in the absence of measurements is $H \simeq \log D$. It has been
conjectured [?, 17] that states of a similar form arise from the interaction of a chaotic
system with a random environment. We will see that the complexity of the resulting
system-environment correlations, as quantified by the average preparation information,
is very large. This is in marked contrast to the third example discussed below.



Environment measurements of the form (14) correspond to grouping the vectors $|\psi_k\rangle$
into disjoint groups. We construct an approximation to an optimal measurement by
grouping the vectors into Hilbert-space spheres of radius $\phi$. (See Ref. [?] for a detailed
argument.) We assume that $D_{\mathcal{E}}$ is sufficiently large so that the state vectors in each
such sphere fill it randomly. Since all spheres are chosen to be of equal size, the average
entropy $\bar H$ is equal to the entropy of one sphere, i.e., the entropy of a uniform mixture
of states within a Hilbert-space sphere of radius $\phi$, given by [?]

$$\bar H = -\Bigl(1 - \frac{D-1}{D}\sin^2\phi\Bigr) \log\Bigl(1 - \frac{D-1}{D}\sin^2\phi\Bigr) - \frac{D-1}{D}\sin^2\phi\, \log\Bigl(\frac{\sin^2\phi}{D}\Bigr) . \tag{18}$$

The volume contained within a sphere of radius $\phi$ in Hilbert space is $(\sin\phi)^{2(D-1)} V_D$,
where $V_D$ is the total volume of projective Hilbert space [?]. The number of spheres of
radius $\phi$ in $D$-dimensional Hilbert space is thus $(\sin\phi)^{-2(D-1)}$, so the information needed
to specify a particular sphere is

$$\bar I_{\min} \simeq I_{\min} = \log\bigl((\sin\phi)^{-2(D-1)}\bigr) = -(D-1)\log(\sin^2\phi) . \tag{19}$$

The information $I_{\min}$ slightly underestimates the actual value of $\bar I_{\min}$, because the perfect
grouping into nonoverlapping spheres of the same size assumed by Eq. (19) does not exist.

As an example, let us choose a Hilbert-space dimension $D = 101$ and a radius of
$\phi \simeq 1.025$. Equations (18) and (19) then give $\Delta H = \log D - \bar H \simeq 1$ and $I_{\min} \simeq 45.3$, which
means that here, to obtain 1 bit of information about the system, more than 45 bits of
information about the environment are needed.
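The numbers quoted in this example can be reproduced with the short sketch below. It reads Eq. (18) as the entropy of a spectrum with one eigenvalue 1 - ((D-1)/D) sin^2(phi) and D-1 equal eigenvalues sin^2(phi)/D, which is our reading of the formula; the function names are illustrative.

```python
import math

def sphere_entropy(D, phi):
    """Eq. (18): entropy (bits) of a uniform mixture within a sphere of radius phi."""
    s = math.sin(phi) ** 2
    lam0 = 1.0 - (D - 1) / D * s      # eigenvalue along the sphere's centre
    lam1 = s / D                      # (D - 1)-fold degenerate eigenvalue
    return -lam0 * math.log2(lam0) - (D - 1) * lam1 * math.log2(lam1)

def sphere_information(D, phi):
    """Eq. (19): information (bits) needed to specify one sphere of radius phi."""
    return -(D - 1) * math.log2(math.sin(phi) ** 2)

D, phi = 101, 1.025
delta_H = math.log2(D) - sphere_entropy(D, phi)
I_min = sphere_information(D, phi)
print(delta_H, I_min)
```

Sweeping `phi` from just above 0 to pi/2 with these two functions traces out a parametric curve of information versus entropy reduction of the kind described for Figure 1.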

Using Eq. (19) to eliminate $\phi$ from Eq. (18) gives a complicated expression for $I_{\min}$
as a function of $\Delta H$ [18], which is a good approximation to the average preparation
information for an optimal $\Delta H$ ensemble. Figure 1 shows this function for a Hilbert
space dimension $D = 101$. To obtain more insight into the properties of this curve, we
consider the derivative [?]

$$\frac{dI_{\min}}{d\Delta H} = \frac{D}{\sin^2\phi\, \ln(1 + D\cot^2\phi)} , \tag{20}$$

which is the marginal tradeoff between information and entropy. For $\phi$ near
$\pi/2$, so that $\epsilon = \pi/2 - \phi \ll 1$, the information becomes $I_{\min} = (D-1)\epsilon^2/\ln 2$, and the
derivative (20) can be written as

$$\frac{dI_{\min}}{d\Delta H} \simeq \frac{D}{\ln(1 + D\epsilon^2)} , \tag{21}$$

which is proportional to $D$ with a slowly varying logarithmic correction. We have thus
identified a situation where the average preparation information is of the same order as
the dimension of Hilbert space $D$, despite the fact that the von Neumann entropy of a
state cannot exceed $\log D$.
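The derivative in Eq. (20) can be recovered from Eqs. (18) and (19) by differentiating both with respect to the auxiliary variable $s = \sin^2\phi$. The following worked steps are a sketch of that calculation; the symbol $s$ is introduced here for convenience and is not part of the original notation.

```latex
\begin{align*}
I_{\min} &= -(D-1)\log s , &
\frac{dI_{\min}}{ds} &= -\frac{D-1}{s\,\ln 2} , \\
\bar H &= -\Bigl(1-\tfrac{D-1}{D}s\Bigr)\log\Bigl(1-\tfrac{D-1}{D}s\Bigr)
          -\tfrac{D-1}{D}\,s\,\log\frac{s}{D} , &
\frac{d\bar H}{ds} &= \frac{D-1}{D}\,\log\frac{D-(D-1)s}{s} , \\
\frac{D-(D-1)s}{s} &= \frac{D\cos^2\phi+\sin^2\phi}{\sin^2\phi}
   = 1 + D\cot^2\phi , &
\frac{d\,\Delta H}{ds} &= -\frac{d\bar H}{ds}
   = -\frac{D-1}{D\,\ln 2}\,\ln\bigl(1+D\cot^2\phi\bigr) , \\
\frac{dI_{\min}}{d\,\Delta H}
  &= \frac{dI_{\min}/ds}{d\,\Delta H/ds}
   = \frac{D}{\sin^2\phi\,\ln\bigl(1+D\cot^2\phi\bigr)} .
\end{align*}
```

Setting $\phi = \pi/2 - \epsilon$ with $\epsilon \ll 1$ gives $\sin^2\phi \simeq 1$ and $\cot^2\phi \simeq \epsilon^2$, which reproduces Eq. (21).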

3.3 Random coherent states 

In this example the system considered is a spin-$j$ particle, for which the dimension of
Hilbert space is $D = 2j+1$. As in the preceding section, we assume that the Hilbert-space
dimension of the environment is much larger than $D$, $D_{\mathcal{E}} \gg D$, and consider

$$\rho_{\rm total} = \frac{1}{D_{\mathcal{E}}} \sum_{k=1}^{D_{\mathcal{E}}} |\psi_k\rangle\langle\psi_k| \otimes P_k^{\mathcal{E}} , \tag{22}$$

but now we choose the $|\psi_k\rangle$ to be distributed randomly on the submanifold of
angular-momentum coherent states [?]. We will see that the resulting complexity of the
system-environment correlations, as quantified by the average preparation information, is small.

The angular momentum coherent state $|\theta, \phi\rangle$ can be defined by rotating the $J_z$
eigenstate $|j;j\rangle$ through Euler angles $\phi$ around the $z$-axis, and then by $\theta$ around the new
$y$-axis. This gives [?]

$$|\theta, \phi\rangle = \sum_{m=-j}^{j} |j;m\rangle \binom{2j}{j+m}^{1/2} \cos^{j+m}(\theta/2)\, \sin^{j-m}(\theta/2)\, e^{-im\phi} . \tag{23}$$



Each coherent state corresponds to a point on the surface of a three-dimensional sphere.
Assuming that $D_{\mathcal{E}}$ is sufficiently large, the state of the system alone is

$$\rho = {\rm tr}_{\mathcal{E}}(\rho_{\rm total}) = \frac{1}{4\pi} \int_0^{2\pi}\!\!\int_0^{\pi} |\theta, \phi\rangle\langle\theta, \phi| \sin\theta\, d\theta\, d\phi . \tag{24}$$

As in the previous section, environment measurements of the form (14) correspond
to grouping the vectors $|\psi_k\rangle$ into disjoint groups. Approximately optimal measurements
correspond to grouping the vectors into approximately equal, compact areas on the
surface of the sphere. We choose the areas to be of the form

$$\Omega_r(\Theta) = \{\theta, \phi :\ \arccos[\mathbf{n}(\theta, \phi) \cdot \mathbf{n}(\theta_r, \phi_r)] < \Theta\} \tag{25}$$

centered at points $(\theta_r, \phi_r)$. The corresponding density operators

$$\rho_r(\Theta) = \frac{\int_{\Omega_r(\Theta)} |\theta, \phi\rangle\langle\theta, \phi| \sin\theta\, d\theta\, d\phi}{\int_{\Omega_r(\Theta)} \sin\theta\, d\theta\, d\phi} \tag{26}$$

can be used to construct a nearly optimal decomposition of $\rho$. The preparation
information $\bar I_{\min}$ is then approximately given by

$$\bar I_{\min} \simeq I_{\min} = \log \frac{4\pi}{2\pi(1 - \cos\Theta)} , \tag{27}$$

where the denominator is the area of $\Omega_r(\Theta)$.

In the following section, we show that $\rho_r(\Theta)$, in the coordinates where $(\theta_r, \phi_r) = (0, 0)$,
can be written in the diagonal form

$$\rho_r(\Theta) = \sum_{m=-j}^{j} |j;m\rangle\, \lambda_m^{\Theta}\, \langle j;m| , \tag{28}$$

where

$$\lambda_m^{\Theta} = \frac{(2j)!\, \sin^{2(j-m)}(\Theta/2)}{(j+m)!\,(j-m+1)!}\, F\Bigl(-j-m,\ j-m+1;\ j-m+2;\ \sin^2\frac{\Theta}{2}\Bigr) , \tag{29}$$

and where $F$ is the hypergeometric function ${}_2F_1$. Since all density operators $\rho_r(\Theta)$ in
the decomposition of $\rho$ have the same entropy, the average entropy $\bar H$ can be written as

$$\bar H = -\sum_{m=-j}^{j} \lambda_m^{\Theta} \log \lambda_m^{\Theta} . \tag{30}$$

For the entropy of the system, Eq. (2), in the absence of measurements, $\rho = \rho_r(\pi)$, we
have (see Eq. (44))

$$H = -\sum_{m=-j}^{j} \lambda_m^{\pi} \log \lambda_m^{\pi} = \log(2j+1) = \log D . \tag{31}$$

We can analyse $\bar H$ in detail for the important special value $\Theta = \pi/2$, for which the
measurement corresponds to a grouping into two disjoint hemispheres, and is therefore
strictly optimal. For such a grouping, $\bar I_{\min} = 1$. In the next section, we show that the
eigenvalues $\lambda_m^{\pi/2}$ obey the bounds

$$0 < \lambda_m^{\pi/2} < \frac{1}{j}\, e^{-j^{1/3}/2} , \qquad m < -1 - j^{2/3} ,$$
$$\frac{1 - e^{-j^{1/3}/3}}{j + 1/2} < \lambda_m^{\pi/2} < \frac{1}{j} , \qquad m > 1 + j^{2/3} . \tag{32}$$

Using these bounds we have derived the following asymptotic expression for the average
entropy:

$$\bar H = -\sum_{m=-j}^{j} \lambda_m^{\pi/2} \log \lambda_m^{\pi/2} = \log j + O(j^{-1/3}) , \tag{33}$$

and hence, in the limit $j \to \infty$,

$$\frac{\bar I_{\min}}{\Delta H} = \frac{1}{\log(2j+1) - \bar H} \longrightarrow 1 . \tag{34}$$

Figure 2 shows a parametric plot of the average preparation information $\bar I_{\min}$ versus
the average entropy reduction $\Delta H = H - \bar H$ for $j = 50$, i.e., $D = 101$. It can be seen that
for moderate values of $\Delta H$, $\bar I_{\min} \simeq \Delta H$. To reduce the system entropy by 1 bit, not more
than approximately 1 bit of information about the environment is needed. This should
be compared to the previous example, where the required environment information is of
the same order as $D$. In the limit of $\Delta H$ approaching its maximum value $H = \log D$, the
information $\bar I_{\min}$ diverges. This is due to the fact that an infinite amount of information
is needed to specify a general state exactly. The complexity of the system-environment
correlations is characterized by the slope of the curve for small values of $\Delta H$ rather than
its asymptotic behaviour for $\Delta H \to \log D$.
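The tradeoff for coherent states can be explored numerically with the sketch below. Instead of evaluating the hypergeometric function in Eq. (29) directly, it uses an equivalent binomial-tail form of the underlying incomplete beta integral; this rewriting, like the function names, is our own and is labeled here as an assumption rather than the paper's expression. It requires Python 3.8 or later for `math.comb`.

```python
import math

def coherent_cap_eigenvalues(two_j, cap_angle):
    """Eigenvalues of rho_r(Theta) for a spherical cap of opening angle cap_angle.

    Assumed equivalent form: lambda = P(X >= j - m + 1) / ((2j + 1) x),
    with X ~ Binomial(2j + 1, x) and x = sin^2(Theta / 2).
    """
    n = two_j + 1                          # Hilbert-space dimension D = 2j + 1
    x = math.sin(cap_angle / 2) ** 2
    eigenvalues = []
    for y in range(n):                     # y = j + m runs over 0, ..., 2j
        a = two_j - y + 1                  # a = j - m + 1
        tail = sum(math.comb(n, i) * x**i * (1 - x) ** (n - i)
                   for i in range(a, n + 1))
        eigenvalues.append(tail / (n * x))
    return eigenvalues

def entropy_bits(probabilities):
    """Entropy of a spectrum in bits (0 log 0 = 0)."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def cap_information(cap_angle):
    """Eq. (27): log2 of (sphere area) / (cap area)."""
    return math.log2(2 / (1 - math.cos(cap_angle)))

two_j = 100                                # j = 50, D = 101
cap_angle = math.pi / 2                    # grouping into two hemispheres
lam = coherent_cap_eigenvalues(two_j, cap_angle)
delta_H = math.log2(two_j + 1) - entropy_bits(lam)
print(cap_information(cap_angle), delta_H)
```

Varying `cap_angle` between 0 and pi gives pairs of information and entropy reduction of the kind plotted in Figure 2; for the hemisphere grouping both quantities come out close to 1 bit.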



4 Mathematical details 

Our task is to calculate the eigenvalues of $\rho_r(\Theta)$ given by Eq. (29) and to derive the
expression (32) for the case $\Theta = \pi/2$. Choosing the coordinate system such that
$(\theta_r, \phi_r) = (0, 0)$, we have

$$\rho_r(\Theta) = \frac{\int_0^{2\pi}\!\int_0^{\Theta} |\theta, \phi\rangle\langle\theta, \phi| \sin\theta\, d\theta\, d\phi}{2\pi(1 - \cos\Theta)} . \tag{35}$$

Substitution of (23) and subsequent integration over $\phi$ gives

$$\rho_r(\Theta) = \frac{4}{1 - \cos\Theta} \sum_{m=-j}^{j} |j;m\rangle\langle j;m| \binom{2j}{j+m} A_\Theta(j+m,\ j-m) , \tag{36}$$

where

$$A_\Theta(p, q) = \int_0^{\Theta/2} \cos^{2p+1}\vartheta\, \sin^{2q+1}\vartheta\, d\vartheta . \tag{37}$$

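Since $\rho_r(\Theta)$ is diagonal in the $|j;m\rangle$ basis, the task of finding the eigenvalues is equivalent
to the problem of evaluating the integral $A_\Theta(p, q)$. Consider the integrand

$$\cos^{2p+1}\vartheta\, \sin^{2q+1}\vartheta = -\frac{1}{2} \cos^{2p}\vartheta\, (1 - \cos^2\vartheta)^q\, \frac{d\cos^2\vartheta}{d\vartheta} . \tag{38}$$

Using the formula [20]

$$(1 + z)^n = F(-n, b; b; -z) , \tag{39}$$

where $F$ is the hypergeometric function ${}_2F_1$, we obtain

$$\cos^{2p+1}\vartheta\, \sin^{2q+1}\vartheta = -\frac{1}{2} (\cos^2\vartheta)^p\, F(-q, b; b; \cos^2\vartheta)\, \frac{d\cos^2\vartheta}{d\vartheta} . \tag{40}$$

We now use [21]

$$\frac{d^n}{dz^n}\bigl[z^{c-1} F(a, b; c; z)\bigr] = \frac{\Gamma(c)}{\Gamma(c-n)}\, z^{c-n-1} F(a, b; c-n; z) , \tag{41}$$

with $n = 1$, $a = -q$, $c = b + 1$ and $z = \cos^2\vartheta$ to get

$$b\, (\cos^2\vartheta)^{b-1} F(-q, b; b; \cos^2\vartheta) = \frac{d}{d\cos^2\vartheta} \bigl[(\cos^2\vartheta)^b F(-q, b; b+1; \cos^2\vartheta)\bigr] . \tag{42}$$

Comparing (40) and (42) we see that, choosing $b = p + 1$, we find

$$\cos^{2p+1}\vartheta\, \sin^{2q+1}\vartheta = -\frac{1}{2(p+1)} \frac{d}{d\vartheta} \bigl[\cos^{2p+2}\vartheta\, F(-q, p+1; p+2; \cos^2\vartheta)\bigr] , \tag{43}$$

and hence, using [20]

$$F(a, b; c; 1) = \frac{\Gamma(c)\,\Gamma(c-a-b)}{\Gamma(c-a)\,\Gamma(c-b)} , \tag{44}$$

one obtains

$$A_\Theta(p, q) = \frac{p!\, q!}{2\,(p+q+1)!} - \frac{\cos^{2p+2}(\Theta/2)}{2(p+1)}\, F\Bigl(-q, p+1; p+2; \cos^2\frac{\Theta}{2}\Bigr) , \qquad q > -1 . \tag{45}$$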

To simplify the above equation we return to the definition (37) and split the integration
to get

$$A_\Theta(p, q) = \int_0^{\pi/2} \cos^{2p+1}\vartheta\, \sin^{2q+1}\vartheta\, d\vartheta + \int_{\pi/2}^{\Theta/2} \cos^{2p+1}\vartheta\, \sin^{2q+1}\vartheta\, d\vartheta . \tag{46}$$

The first integral is proportional to the beta function and the second integral can be
transformed into $-A_{\pi-\Theta}(q, p)$ by the substitution $\vartheta \to \pi/2 - \vartheta$, so that

$$A_\Theta(p, q) = \frac{p!\, q!}{2\,(p+q+1)!} - A_{\pi-\Theta}(q, p) . \tag{47}$$

Using this formula Eq. (45) can be transformed into

$$A_\Theta(p, q) = \frac{\sin^{2q+2}(\Theta/2)}{2(q+1)}\, F\Bigl(-p, q+1; q+2; \sin^2\frac{\Theta}{2}\Bigr) . \tag{48}$$



Expression (36) for $\rho_r(\Theta)$ can now be rewritten in the compact form of Eqs. (28) and (29).

To investigate the case $\Theta = \pi/2$, we calculate directly

$$A_{\pi/2}(p, q) = \int_0^{\pi/4} \cos^{2p+1}\vartheta\, \sin^{2q+1}\vartheta\, d\vartheta . \tag{49}$$

Substitution of $t = \tan^2\vartheta$ gives

$$A_{\pi/2}(p, q) = \frac{1}{2} \int_0^1 t^q (1 + t)^{-(p+q+2)}\, dt . \tag{50}$$

Using the integral representation of the hypergeometric function [21],

$$F(a, b; c; z) = \frac{1}{B(b, c-b)} \int_0^1 t^{b-1} (1-t)^{c-b-1} (1 - tz)^{-a}\, dt , \qquad \Re c > \Re b > 0 , \tag{51}$$

we find

$$A_{\pi/2}(p, q) = \frac{F(p+q+2,\ q+1;\ q+2;\ -1)}{2(q+1)} , \qquad q > -1 , \tag{52}$$

which is a rather compact expression. We can obtain additional insight in the following
way. Consider the Gauss formula for the so-called contiguous functions $F(a, b-1; c; z)$
and $F(a, b; c+1; z)$ [?],

$$c(1 - z) F(a, b; c; z) - c F(a, b-1; c; z) + (c - a) z F(a, b; c+1; z) = 0 . \tag{53}$$

For the case $b = c$ and $z = -1$ we get, using (39),

$$F(a, b; b+1; -1) = \alpha(b) F(a, b-1; b; -1) + \beta(b) , \tag{54}$$

where

$$\alpha(b) = \frac{b}{a - b} , \qquad \beta(b) = -2^{1-a} \alpha(b) . \tag{55}$$

Iteration of (54) $b - 1$ times gives

$$F(a, b; b+1; -1) = F(a, 1; 2; -1) \prod_{k=0}^{b-2} \alpha(b-k) + \sum_{s=0}^{b-2} \beta(b-s) \prod_{k=0}^{s-1} \alpha(b-k) . \tag{56}$$

Noticing that

$$\prod_{k=0}^{s} \alpha(b-k) = \frac{b!\,(a-b-1)!}{(b-s-1)!\,(a-b+s)!} \tag{57}$$

and substituting $x = b - 1 - s$, we obtain

$$F(a, b; b+1; -1) = \frac{b!\,(a-b-1)!}{(a-1)!} \Bigl[(a-1) F(a, 1; 2; -1) - 2^{1-a} \sum_{x=1}^{b-1} \binom{a-1}{x}\Bigr] . \tag{58}$$

The value of $F(a, 1; 2; -1)$ can be calculated using the integral representation (51),

$$F(a, 1; 2; -1) = \frac{1 - 2^{1-a}}{a - 1} , \tag{59}$$






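and hence we find

$$F(a, b; b+1; -1) = \frac{b!\,(a-b-1)!}{(a-1)!} \Bigl[1 - 2^{1-a} \sum_{x=0}^{b-1} \binom{a-1}{x}\Bigr] . \tag{60}$$

Using this formula, Eq. (52) can be rewritten as

$$A_{\pi/2}(p, q) = \frac{1}{2(p+q+1)} \binom{p+q}{q}^{-1} \Bigl[1 - 2^{-(p+q+1)} \sum_{x=0}^{q} \binom{p+q+1}{x}\Bigr] . \tag{61}$$

Using (47) we have

$$A_{\pi/2}(p, q) = \frac{1}{2(p+q+1)} \binom{p+q}{q}^{-1} 2^{-(p+q+1)} \sum_{x=0}^{p} \binom{p+q+1}{x} \tag{62}$$

and therefore

$$\lambda_m^{\pi/2} = \frac{2}{2j+1} \sum_{x=0}^{j+m} \binom{2j+1}{x} \Bigl(\frac{1}{2}\Bigr)^{2j+1} . \tag{63}$$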
The sum on the right hand side is the univariate cumulative distribution function [21]

$$G(y; n, p) = \sum_{x=0}^{y} P(x; n, p) \tag{64}$$

for the binomial distribution

$$P(x; n, p) = \binom{n}{x} p^x (1 - p)^{n-x} , \tag{65}$$

where $n = 2j + 1$, $p = 1/2$ and $y = j + m$. In the limit $j \to \infty$, $G(y; n, p)$ approaches
the step function

$$\lim_{j\to\infty} G(j + m;\ 2j + 1,\ 1/2) = \begin{cases} 0 , & \lim_{j\to\infty} m/j < 0 , \\ 1/2 , & m = 0 , \\ 1 , & \lim_{j\to\infty} m/j > 0 . \end{cases} \tag{66}$$



To obtain the behaviour for large but finite values of $j$, we use the Chernoff bounds [22]

$$\sum_{x < np(1-\epsilon)} P(x; n, p) < e^{-\epsilon^2 np/2} ,$$
$$\sum_{x > np(1+\epsilon)} P(x; n, p) < e^{-\epsilon^2 np/3} , \tag{67}$$

which are valid for $0 < \epsilon < 1$. Choosing $\epsilon = j^{-1/3}$, we obtain Eq. (32), in agreement
with Eq. (66).
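The closed forms of this section lend themselves to a quick numerical cross-check. The sketch below compares a Simpson's-rule evaluation of the integral in Eq. (49) with the binomial-sum expression of Eq. (61), and evaluates the hemisphere eigenvalues by reading Eq. (63) as $\lambda_m^{\pi/2} = \frac{2}{2j+1} G(j+m; 2j+1, 1/2)$. That reading, the step count and the function names are assumptions made for illustration. It requires Python 3.8 or later for `math.comb`.

```python
import math

def A_half_pi_numeric(p, q, steps=2000):
    """Simpson's rule for Eq. (49): cos^(2p+1) sin^(2q+1) integrated over [0, pi/4]."""
    def f(t):
        return math.cos(t) ** (2 * p + 1) * math.sin(t) ** (2 * q + 1)
    h = (math.pi / 4) / steps              # steps must be even for Simpson's rule
    total = f(0.0) + f(math.pi / 4)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3

def A_half_pi_closed(p, q):
    """Binomial-sum form of Eq. (61) for nonnegative integers p and q."""
    n = p + q + 1
    lower_tail = sum(math.comb(n, x) for x in range(q + 1)) / 2 ** n
    return (1 - lower_tail) / (2 * n * math.comb(p + q, q))

def eigenvalue_half_pi(two_j, y):
    """Assumed reading of Eq. (63), with y = j + m and n = 2j + 1."""
    n = two_j + 1
    return 2 / n * sum(math.comb(n, x) for x in range(y + 1)) / 2 ** n

for p, q in [(0, 0), (1, 0), (0, 1), (3, 2), (5, 7)]:
    print(p, q, A_half_pi_numeric(p, q), A_half_pi_closed(p, q))

two_j = 100
spectrum = [eigenvalue_half_pi(two_j, y) for y in range(two_j + 1)]
print(sum(spectrum))
```

The eigenvalues rise from nearly 0 to nearly 2/(2j+1) over a window of a few times sqrt(j) around m = 0, which is the finite-j version of the step function in Eq. (66).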
Figure 1: Average preparation information $\bar I_{\min}$ in bits, versus average entropy reduction
$\Delta H$ in bits, for the example of subsection 3.2.

Figure 2: Average preparation information $\bar I_{\min}$ in bits, versus average entropy reduction
$\Delta H$ in bits, for the example of subsection 3.3.